Structural Bayesian language modeling and adaptation
Authors
Abstract
We propose a language modeling and adaptation framework based on the Bayesian structural maximum a posteriori (SMAP) principle, in which each n-gram event is embedded in a branch of a tree structure. The nodes in the first layer of this tree represent the unigrams, those in the second layer represent the bigrams, and so on. Each node in the tree has an associated hyper-parameter, which carries the information about the prior distribution, and a count, which records the number of times the word sequence occurs in the domain-specific data. In general, the hyper-parameters depend on the observation frequency of not only the node event but also its parent node, which represents the lower-order n-gram event. Our automatic speech recognition experiments on the Wall Street Journal corpus verify that the proposed SMAP language model adaptation achieves a 5.6% relative improvement over maximum likelihood language models obtained with the same training and adaptation data sets.
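The tree described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' SMAP estimator: the abstract does not give the hyper-parameter formulas, so the sketch assumes a generic MAP-style estimate in which the prior mean for an n-gram is the estimate obtained with a shortened history, weighted by a single hypothetical constant `tau`. The names `NGramNode`, `NGramTree`, `add_sentence`, and `prob` are illustrative only.

```python
class NGramNode:
    """One node of the n-gram tree; a node at depth k represents a k-gram."""

    def __init__(self):
        self.count = 0      # times this word sequence occurs in the data
        self.children = {}  # next word -> NGramNode (the (k+1)-gram layer)


class NGramTree:
    """Count tree with a MAP-style estimate (assumed form, not the paper's)."""

    def __init__(self, order, tau=1.0):
        self.order = order  # maximum n-gram order stored in the tree
        self.tau = tau      # hypothetical prior weight (hyper-parameter)
        self.root = NGramNode()
        self.vocab = set()

    def add_sentence(self, words):
        """Add every n-gram of the sentence, up to `order`, to the tree."""
        self.vocab.update(words)
        for i in range(len(words)):
            node = self.root
            for w in words[i:i + self.order]:
                node = node.children.setdefault(w, NGramNode())
                node.count += 1

    def _find(self, seq):
        """Return the node for a word sequence, or None if it was never seen."""
        node = self.root
        for w in seq:
            node = node.children.get(w)
            if node is None:
                return None
        return node

    def prob(self, word, history=()):
        """Estimate P(word | history) as (count + tau * prior) / (total + tau)."""
        history = tuple(history)
        history = history[-(self.order - 1):] if self.order > 1 else ()
        # Assumption: the prior mean is the estimate with one fewer history
        # word, and a uniform distribution once the history is empty.
        if history:
            prior = self.prob(word, history[1:])
        else:
            prior = 1.0 / len(self.vocab)
        ctx = self._find(history)
        children = ctx.children if ctx is not None else {}
        total = sum(child.count for child in children.values())
        seen = children[word].count if word in children else 0
        return (seen + self.tau * prior) / (total + self.tau)


# Usage: a bigram tree built from two short, made-up adaptation sentences.
tree = NGramTree(order=2, tau=1.0)
tree.add_sentence(["the", "market", "rose"])
tree.add_sentence(["the", "market", "fell"])
p_seen = tree.prob("market", ["the"])   # bigram observed in the data
p_unseen = tree.prob("rose", ["the"])   # bigram never observed
```

Because the prior mean sums to one over the vocabulary, the estimates for a fixed history also sum to one. A larger `tau` pulls each estimate toward the lower-order distribution, and a smaller `tau` pulls it toward the raw counts.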
Similar resources
A Hierarchical Bayesian Approach for Semi-supervised Discriminative Language Modeling
Discriminative language modeling provides a mechanism for differentiating between competing word hypotheses, which are usually ignored in traditional maximum likelihood estimation of N-gram language models. Discriminative language modeling usually requires manual transcription, which can be costly and slow to obtain. On the other hand, there is a vast amount of untranscribed speech data on which ...
Modeling Structural Relationships Between Epistemological Beliefs and Mediating Learning Strategies on Anxiety in English Students
Introduction: The purpose of this study was to model the structural relationships between epistemological beliefs and mediating learning strategies on the English language anxiety of female third-year high school students in Babol. Methods: The research was correlational, based on structural equation modeling. The statistical population consisted of 3rd grade high school girl s...
Language Proficiency and Identity: Developing a Structural Equation Modeling (SEM) of Identity for Iranian EFL Learners
This study was an endeavor to develop a model of identity among Iranian EFL learners. To achieve this end, a multiphase design was implemented. Initially, it attempted to investigate different factors of identity to propose and validate a model. Thus, 120 EFL learners studying in different English language institutes in Iran were randomly selected, and 36 learners were interviewed about their v...
A Model of Iranian EFL Learners' Cultural Identity: A Structural Equation Modeling Approach
This study aimed, firstly, to investigate the underlying components of Iranian cultural identity and, secondly, to confirm the aforementioned components via Structural Equation Modeling (SEM) analysis. In order to achieve these goals, the researchers reviewed the extensive local and international literature on language, culture and identity. Based on the literature and consultations with a grou...
Optimal on-line Bayesian model selection for speaker adaptation
In this paper, we show how to accommodate a Bayesian variant of Rissanen's MDL into on-line Bayesian adaptation to control both model structural complexity and parameterization complexity to best fit an available amount of adaptation data, the goal being minimization of the resulting recognition error. An efficient bottom-up dynamic-programming-based pruning algorithm is developed for selecting mode...